Transcriptome profiling of human thymic and peripheral blood CD4 + and CD8+ T cells, using RNA-seq
Some basic info of the chosen expression dataset
| contact_address | contact_city | contact_country | contact_department | contact_email | contact_institute |
|---|---|---|---|---|---|
| Kirkeveien 166, Laboratoriebygget | OSLO | Norway | Department of Medical Genetics | b.a.lie@medisin.uio.no | OUS |
| celltype | tissue | patient | |
|---|---|---|---|
| CD4_thymus_1 | CD4 | thymus | 1 |
| CD4_thymus_2 | CD4 | thymus | 2 |
| CD4_thymus_3 | CD4 | thymus | 3 |
| CD4_thymus_4 | CD4 | thymus | 4 |
| CD4_bloodinfant_1 | CD4 | bloodinfant | 1 |
| CD4_bloodinfant_2 | CD4 | bloodinfant | 2 |
| CD4_bloodinfant_3 | CD4 | bloodinfant | 3 |
| CD4_bloodinfant_4 | CD4 | bloodinfant | 4 |
| CD4_bloodadult | CD4 | bloodadult | NA |
| CD8_thymus_1 | CD8 | thymus | 1 |
| CD8_thymus_2 | CD8 | thymus | 2 |
| CD8_thymus_3 | CD8 | thymus | 3 |
| CD8_thymus_4 | CD8 | thymus | 4 |
| CD8_thymus_5 | CD8 | thymus | 5 |
| CD8_bloodinfant_1 | CD8 | bloodinfant | 1 |
| CD8_bloodinfant_2 | CD8 | bloodinfant | 2 |
| CD8_bloodinfant_3 | CD8 | bloodinfant | 3 |
| CD8_bloodadult | CD8 | bloodadult | NA |
##
## The downloaded binary packages are in
## /var/folders/4w/5ghh1f910_zfbnfv6h5rst9w0000gn/T//Rtmp6IEueW/downloaded_packages
This is a short glaze of what we start with:
The 15 samples are kept, The cell-type category is composed of CD4 and CD8, and there are 8 CD4 samples and 7 CD8 samples in the dataset. 2 samples of “bloodadult” tissue type was removed.
knitr::kable(normalized_count_data[c(1:5),c(1:5)], type="html")
| ensembl_gene_id | hgnc_symbol | CD4_thymus_1 | CD4_thymus_2 | CD4_thymus_3 | |
|---|---|---|---|---|---|
| ENSG00000000003 | ENSG00000000003 | TSPAN6 | 2.750699 | 4.395522 | 1.618822 |
| ENSG00000000419 | ENSG00000000419 | DPM1 | 37.402365 | 31.998891 | 36.502047 |
| ENSG00000000457 | ENSG00000000457 | SCYL3 | 18.826214 | 20.721749 | 23.585623 |
| ENSG00000000460 | ENSG00000000460 | C1orf112 | 34.115816 | 23.810148 | 31.870986 |
| ENSG00000000938 | ENSG00000000938 | FGR | 3.268688 | 4.664636 | 2.780002 |
Ranked genes with Top p-value
This is the resulting table with ranked genes with Top p-value
knitr::kable(output_hits2[1:5,],type="html")
| ensembl_gene_id | hgnc_symbol | logFC | AveExpr | t | P.Value | adj.P.Val | B | |
|---|---|---|---|---|---|---|---|---|
| 9800 | ENSG00000172116 | CD8B | 278.623826 | 140.008071 | 21.51238 | 0 | 1.00e-07 | 4.884591 |
| 7408 | ENSG00000153563 | CD8A | 486.339551 | 236.540282 | 15.09044 | 0 | 4.50e-06 | 4.271212 |
| 6043 | ENSG00000138835 | RGS3 | 41.451709 | 26.976475 | 13.07518 | 0 | 1.59e-05 | 3.913343 |
| 2600 | ENSG00000106028 | SSBP1 | -46.072426 | 85.107571 | -12.97505 | 0 | 1.59e-05 | 3.892086 |
| 13955 | ENSG00000237506 | RPSAP15 | -2.103834 | 1.160512 | -11.91344 | 0 | 3.79e-05 | 3.641291 |
Short summary of # of genes that pass the thresholds
Calculate differential expression using the Quasi liklihood model with coef=“cell-type”
|
|
|
|
Short summary of genes that pass the threshold
Using the MeanVar Plot to explore the mean-variance relationship. Only the trended dispersion is used under the quasilikelihood pipeline.
Calculate dispersion to observe variance deviates from the mean. Biological Coefficient of Variation (BCV) and dispersion is a measure of how much variation there in the samples.
Here is a summary of the number of Up regulated and down regulated genes:
HeatMaps of top hit genes using Limma Model#2 (accounting for cell-type and tissue variability)
Generally we can observe the same cell type with same tissue type samples are definitely clustered in groups.
By reading the first heatmap we can observe that with in the Cell type groups, different tissue type shows differences in there gene expression pattern, with the second heatmap, within the same tissue, the gene of different cell type also express differently.
Which method did you choose and why?
I used g:profiler for the thresholded gene set enrichment analysis.
What annotation data did you use and why? What version of the annotation are you using?
I chose GO:BP, KEGG and REAC as my annotation data.
GO biological process: releases/2019-07-01 KEGG: KEGG FTP Release 2019-09-30 Reactome: ensembl classes: 2019-10-2
How many genesets were returned with what thresholds? The threshold is 0.05.
Upregulated genes: 1353 go bioligical process terms, 110 KEGG terms and 103 REAC terms
Downregulated genes: 657 go bioligical process terms,22 KEGG terms and 376 REAC terms
Diff genes: 538 go bioligical process terms, 31 KEGG terms and 63 REAC terms
Run the analysis using the up-regulated set of genes, and the down-regulated set of genes separately. How do these results compare to using the whole list (i.e all differentially expressed genes together vs. the up-regulated and down regulated differentially expressed genes separately)?
#####Up-Regulated Genes
#####Down-Regulated Genes
Download the “Human_GOBP_AllPathways_no_GO_iea_February_01_2020_symbol.gmt” from bader lab link: “http://download.baderlab.org/EM_Genesets/current_release/Human/symbol/”
Required field Gene sets database:Human_GOBP_AllPathways_no_GO_iea_February_01_2020_symbol.gmt Number of permutations:1000 collapse/Remap to gene symbols:No_Collapse
Basic fields Enrichment statistic: weighted Max size: 200 Min size: 15
Enrichment in phenotype: na + 2310 / 3211 gene sets are upregulated in phenotype na_pos + 1158 gene sets are significant at FDR < 25% + 522 gene sets are significantly enriched at nominal pvalue < 1% + 855 gene sets are significantly enriched at nominal pvalue < 5%
Enrichment in phenotype: na
The top hit for upregulated gene is IMMUNOREGULATORY INTERACTIONS BETWEEN A LYMPHOID AND A NON-LYMPHOID CELL%REACTOME%R-HSA-198933.5
ES = 0.68 NES = 3.38 FDR q-val= 0.000 size = 74 genes
The top hit for downregulated gene is TRANSLATION%REACTOME DATABASE ID RELEASE 71%72766
ES = -0.62 NES = -4.55 FDR q-val = 0.000 size = 129 genes
The top largest annotated clusters are:
Do the enrichment results support conclusions or mechanism discussed in the original paper? How do these results differ from the results you got from Assignment #2 thresholded methods?
Yes the enrichment result seems matching with the mechanism discussed in the paper such that the up regulated genes are falls into the categories of “IMMUNOREGULATORY INTERACTIONS BETWEEN A LYMPHOID AND A NON-LYMPHOID CELL”